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ABSTRACT 

We present a stochastic approach to the spatial clustering of dark matter 
haloes in Lagrangian space. Our formalism is based on a local formulation 
of the 'excursion set' approach by Bond et al., which automatically accounts 
for the 'cloud-in-cloud' problem in the identification of bound systems. Our 
method allows to calculate correlation functions of haloes in Lagrangian space 
using either a multi-dimensional Fokker-Planck equation with suitable bound- 
ary conditions or an array of Langevin equations with spatially correlated 
random forces. We compare the results of our method with theoretical pre- 
dictions for the halo auto-correlation function considered in the literature and 
find good agreement with the results recently obtained within a treatment of 
halo clustering in terms of 'counting fields' by Catelan et al.. The possible 
effect of spatial correlations on numerical simulations of halo merger trees is 
finally discussed. 

Key words: galaxies: clustering - cosmology: theory - large-scale structure 
of Universe. 



1 INTRODUCTION 

The celebrated Press & Schechter (1974, hereafter PS) theory represents a fundamental tool to determine the mean 
mass distribution of bound condensations, nowadays identified as dark matter (DM) haloes, in the Universe. Recently 
a number of authors (e.g. Efstathiou et al. 1988; Cole & Kaiser 1989; Mo & White 1996; Mo, Jing & White 1996; 
Catelan et al. 1997, hereafter CLMP) went beyond this application, extending the PS method to study halo clustering 
both in Lagrangian and Eulerian space. 

The clustering problem can be dealt with in two steps: i) identifying the preferential sites for halo formation in 
Lagrangian space and ii) providing a theoretical framework to follow the non-linear dynamics of the halo distribution. 
The former step can be accurately modeled within linear theory by the PS scheme, which in fact suggests a simple 
criterion to assign each Lagrangian point to a matter clump of some mass M which is going to collapse at some 
formation redshift Zf. The latter step, instead, requires some knowledge of the non-linear matter dynamics, which is 
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ultimately responsible for the actual Eulerian distribution of matter clumps as they form by the hierarchical process 
of matter accretion and merging of subunits. A relevant progress in this direction has been made by Mo & White 
(1996), who showed that a linear local bias relation connects the Eulerian halo auto-correlation function to that of the 
mass, thus providing a good fit to the two-point function of haloes in N-body simulations. A more refined, non-linear 
and non-local relation has boon recently obtained by CLMP. 

At the Lagrangian level, however, one has to deal with the so-called cloud-in-cloud problem, inherent in the PS 
scheme, namely the fact that their approach selects bound systems of given mass that can have been already included 
in larger mass condensations of the same catalogue. A straightforward solution of this problem in connection with 
halo correlations was sketched by CLMP, who adopted the 'peak-background' split (e.g. Bardeen et al. 1986; Cole 
& Kaiser 1989) to solve the problem up to the resolution scale of the 'background' mass component, which is also 
responsible for the motion of the haloes from their original Lagrangian positions. 

The transition from the Lagrangian to the Eulerian halo distribution can be dealt with exactly thanks to the 
continuity equation, as shown by CLMP, who obtained an analytic formula for the Eulerian 'bias field' connecting the 
halo number density fiuctuations to the mass perturbations. This relation possesses the remarkable feature of being 
non-linear and non-local, as it depends on the halo overdensity at the original Lagrangian positions. Therefore, the 
statistical halo distribution in Lagrangian space, which can also be described in terms of a hierarchy of bias factors 
(cf. CLMP), plays the role of the initial condition in generating the Eulerian halo density field. 

Aim of the present paper is to provide an exact treatment of the halo correlation properties in the Lagrangian 
world, which extends and puts on sounder bases the results obtained by Mo & White (1996), as already sketched in 
CLMP. Our approach here is entirely based on the excursion set version of the PS theory (Peacock & Heavens 1990; 
Cole 1991; Bond et al. 1991, hereafter BCEK), which allows to define halo populations free of the cloud-in-cloud 
problem on all relevant scales. 

Our formalism has to be considered in every respect as the natural extension of the BCEK model to include the 
spatial clustering properties of the haloes. Once a cosmological model and a power-spectrum of density fluctuations 
are selected the present formalism allows to construct halo correlations of any order in Lagrangian space. Here we 
only give explicit results for the two-point function, but the generalization to higher order would be straightforward. 

An obvious and important application of this type of study aims at modeling galaxy clustering at different 
redshifts. The key idea being that the spatial distribution of DM haloes can be a clue for understanding the clustering 
properties of luminous objects like galaxies with different physical properties and at various epochs (e.g. Kauffmann, 
Nusser & Steinmetz 1997; Matarrese et al. 1997; Moscardini et al. 1997). The actual relation between the parent 
DM haloes and the galaxies, however, depends on the way haloes accrete mass and on how they grow by the process 
of merging with the surrounding haloes. These processes are usually studied in terms of the so-called 'halo merger 
trees', obtained either by Monte Carlo methods (e.g. Cole 1991; Lacey & Cole 1993; Kauffmann & White 1993; Tozzi, 
Governato & Cavaliere 1996; Somerville & Kolatt 1997; Salvador-Sole, Solanes & Manrique 1997 and references 
therein) or by directly looking at the dynamics of particles in cosmological N-body simulations (e.g. Kauffmann et 
al. 1997; Roukema et al. 1997; Governato et al. 1997). Realizations of halo merger trees can be in fact obtained by 
Monte Carlo techniques in the frame of the excursion set theory, where halo accretion histories are associated to 
random walks. The key idea is to average the linear density field over spheres of decreasing size centred on each 
Lagrangian point, thus defining a set of trajectories of the smoothed field as a function of the resolution scale. At 
each cosmic epoch the process of halo formation is here modeled as the upcrossing of a threshold by these random 
trajectories. Therefore, every trajectory gives a detailed description of the mass accretion history of the corresponding 
halo. A merger tree is then obtained by suitably connecting the accretion histories of all the progenitors of a given 
halo (which, in the tree-jargon, are usually called 'branches'). 

In the latter approach, however, spatial correlations between different branches of the tree as well as correlations 
between different trees are completely ignored. The existence of this shortcoming has been also noticed by Rodrigues & 
Thomas (1996) in connection with the so-called 'block model' (Cole & Kaiser 1988; Cole 1991) , which can be considered 
as a rudimentary form of merging tree, where haloes can only grow by doubling their mass. Yano, Nagashima & Gouda 
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Figure 1. Examples of trajectories obtained by smoothing the linear density fluctuation field with a series of sharp fc-space 
filters (with decreasing resolution scales Rf = l/fc/) at two points separated by the distance r. Here we consider r = 4Mpc 
and a CDM power spectrum (with density parameter H = 1 and present Hubble constant Ho = 50kms~^ Mpc~^) linearly 
extrapolated to erg = 1. The heavy continuous line represents the trajectory associated to a point x in Lagrangian space. The 
trajectory associated to the point y such that |y — x| = r is plotted with a light continuous line. The dotted line is obtained by: 
i) considering the trajectory associated to y, ii) artificially removing any correlation with the trajectory at x and Hi) suitably 
rescaling the result. In practice, the dotted line represents a trajectory which is completely independent of the one associated 
to X, but which has the same statistical one-point properties. This is what is generally used in building up merger trees of dark 
matter haloes. However, the actual trajectories associated to neighbouring points are strongly correlated when the smoothing 
length Rf is comparable to the physical separation between the points r. The same trajectories tend to become less and less 
correlated a.s Rf decreases. It is interesting to note that a memory of the strong correlation existing for r/Rf <S 1 remains even 
when r/Rf 3> 1. In fact, when Rf is small compared to r, the real trajectory associated to y can be accurately approximated 
by adding a constant value to the independent trajectory so that it matches the random walk at x when Rf^ r. In other words, 
since the trajectories are continuous functions of Rf, the value assumed at Rf ~ r (i.e. at the end of the strongly correlated 
regime) represents the initial condition for the random walk at small Rf (i.e. in the independent regime). This derives from a 
sort of natural peak-background split; by evaluating the density fluctuation fleld in two points separated by r one samples the 
same background value (obtained by summing up all the Fourier components corresponding to wavelenghts larger than r) but 
different high frequency modes. 



(1996) and Nagashima & Gouda (1997) analysed the possible changes induced by spatial correlations on the PS mass 
function. 

The problem within the standard Monte Carlo approach is that each trajectory is treated as being independent of 
all the others. In a general Gaussian field, however, this situation is not realized and, for power-spectra of cosmological 
interest, the density fluctuations turn out to be strongly correlated on small scales. This statistical property can be 
clearly detected by looking at a pair of trajectories associated to two neighbouring points. In Fig. ^ we show two 
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trajectories obtained by smoothing a correlated field with a sharp fc-space filter. When the smoothing length is much 
larger than the separation between the points in which the trajectories are computed the density contrast assumes 
values that are practically coincident. On the other hand, for filtering radii much smaller than the lag that separates 
the points, the trajectories become indeed uncorrelated. The transition from the completely correlated situation to 
the totally uncorrelated one is continuous, i.e. the trajectories become less and less correlated as the smoothing 
length decreases. For comparison, in Fig. ^ we also show the same trajectories modified by artificially removing any 
correlation between them. It is clear that the joint distribution of first upcrossings at different points is heavily 
infiuenced by the correlations of the underlying density fiuctuation field. 

In this paper we will i) compute the joint two-point distribution of the first upcrossing events and use the solution 
to obtain an approximate analytical expression for the halo-halo correlation function, ii) numerically solve suitable 
sets of spatially correlated Langevin equations to directly compute the halo two-point function at different lags. In a 
forthcoming paper we will consider the effects of correlations on the merging histories of dark matter haloes. 

The plan of the paper is as follows. In Section 2 we present a straightforward extension of the excursion set 
approach and review the standard application of this method which allows to determine the mean mass function 
of DM haloes. In Section 3 we study the Lagrangian clustering of haloes identified by our scheme and obtain the 
halo-halo correlation function both from an approximate analytic treatment and from a numerical integration of two 
spatially correlated Langevin equations. Section 4 contains a discussion of our results and hints at some applications 
of our algorithm to generate spatially correlated halo merger trees. 



2 EXCURSION SET APPROACH 

In this section we review the formulation of the excursion set approach, mostly following the treatment given by 
BCEK. 



2.1 Langevin equation 

Let us assume that the linear density fiuctuations form a homogeneous and isotropic Gaussian random field S{x,z), 
with X the Lagrangian comoving position and z the redshift. In linear theory one can factor out the redshift dependence 
as Z)(z)5(x), where D(z) is the linear growth factor (e.g. Peebles 1980). With the normalization D(z = 0) = 1, (S(x) 
becomes the mass density fiuctuation linearly extrapolated to the present epoch. 

The statistical properties of the Gaussian field 5(x) are completely specified by the two-point function in Fourier 
space, which is related to the power-spectrum P{k) by {(5(ki) i5(k2)) = (27r)^i5^(ki -|- k2)P(A;i). The symbol Sj^ 
represents the Dirac delta function, and the brackets (•) denote ensemble averaging. Our Fourier transform convention 
is 5(k) = / dx5(x) e* ". 

We want now to study the statistical properties of the density fiuctuation field at some resolution scale Rf. This 
is introduced by convolving (5(x) by some filter function W(|x' — x|, Rf), 

5(x,i?/) = J d^'W(\x' -x\,Rf)5{^') = J dkW?(fcJ?/)5(k)e"*" , (1) 

where W is the Fourier transform of the filter. At each point x the smoothed field represents the weighted average 
of 5(x) over a spherical region of characteristic dimension Rf centred in x. The detailed properties of 5(x,_R/) 
clearly depend upon the specific choice of window function. The most commonly used smoothing kernels are the 
top-hat filter Wth{\x\, Rf) = 30(7?/ — |x|)/47r_R|, where 0{x) is the Heaviside step function, and the Gaussian one 
WG{x,Rf) = {2nR})-^^'^ ex.p{-x^ /2R}). Recently, for convenience of analysis, top-hat filtering has been also applied 
in momentum space WsKs{k, Rf) — Q{kf — k), where kf = 1/Rf and kf = |k/|. This kernel is generally called sharp 
fc-space filter. While it is easy to associate a mass to real space top-hat filtering Mth(-R/) = iirpbR^' /3, there is always 
a bit of arbitrariness in assigning a mass to the other window functions. The most common procedure is to multiply the 
average density by the volume enclosed by the filter, obtaining MciRf) = {2n)^^^ ptR'f and MsKs{Rf) ~ dTT^pbkJ"^ 
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(Lacey & Cole 1993). An alternative procedure, originally introduced by Bardeen et al. (1986), corresponds to the 
choice MsKs{Rf) = ^i^pbR^jj /i, where Rth is chosen by requiring a%jig{Rf) — a^jj{RTH), and similarly for the 
Gaussian filter. In this way one obtains good agreement with numerical simulations of clustering growth (Lacey & 
Cole 1994). 

In hierarchical bottom-up models of structure formation at each epoch mass fluctuations have typically undergone 
non-linear collapse up to some scale Rf, and the first objects that form are those of lower mass. Haloes of larger mass 
arise by the merging and accretion of subunits. The mass distribution deriving from this hierarchical halo formation 
process has been successfully modeled by Peacock & Heavens (1990), Cole (1991), BCEK and Lacey & Cole (1993). 
In order to mimic the accretion of matter all these models consider a full hierarchy of decreasing resolution scales 
Rf. The effect of varying Rf can be obtained by differentiating eq. (jl|) 

OSi.,Rf)_ 1 /,k5(k)^^^e--.,(x,i?,). (2) 



dRf (27r)3 J ' dRf 

This has the form of a Langevin equation, which shows how an infinitesimal change of the resolution scale Rf affects 
the value of the density fluctuation field S{x,Rf) in the given position x through the action of the stochastic force 
rj{:x.,Rf). In the limit Rf oo one has S{x;Rf) 0, which can be adopted as initial condition for our flrst-order 
stochastic differential equation. Thus, by solving it, we can associate to each point x a trajectory (5(x, i?/) obtained 
by varying the resolution scale Rf. Notice that, since eq. (jij) is linear, r;(x,i?/) is also a zero mean Gaussian random 
field, uniquely specified by its auto-correlation function 

/ r R w R ^\ ^ r ^i. Pfi. dWjkiRfi) dWjkrRf^) . . 
{v{^i,Rfi)v{x2,Rf2)) ^ ^ J dfcifciP(fci) — g^^^' — dRj^^°^ ^^^' 

where r — jxi — X2j and jo{x) is the zeroth order spherical Bessel function. Trajectories associated to different 
neighbouring points will be statistically influenced by the correlation properties of the force ri{x,Rf), i.e. of the 
underlying Gaussian field 5(x). On the other hand the coherence of each trajectory along the Rf direction depends 
exclusively on the analytic form of the filter function and vanishes for the sharp fc-space window (BCEK). With such 
a filter, by decreasing the smoothing length one adds up a new set of Fourier modes of the unsmoothed distribution to 
(5(x, Rf). For a Gaussian field this is completely independent of the previous increments, and each trajectory (5(x, Rf) 
becomes a Brownian random walk. 

In the case of sharp fc-space filtering the notation greatly simplifies if we use as time variable the variance of the 
filtered density field, A = a^ikf) = {S{kf)^) = {27r'^)-^ J^^ dk P P{k). In such a case the stochastic process reduces 
to a Wiener one, namely 
a5(x,A) 

^^=C(x,A), (4) 
with {C(x, A)) = and 

{C(x,Ai)C(x,A2))=<5^(Ai-A2) (5) 

[see eq. (17) for the spatial correlation]. In the following we will adopt A as time variable, unless explicitly stated. 
The solution of the Langevin equation (^ in an arbitrary point of space (the position x is here understood), with 
the initial condition 5{A = 0) = 0, is simply (5(A) = dA'(^(A'). By ensemble averaging this expression one obtains 
((5(A)) = and ((5(Ai)(5(A2)) = min(Ai, A2), which uniquely determine the Gaussian distribution of 5(A). More details 
can be found in Porciani et al. (1996). 



2.2 Press-Schechter mass function 

Press & Schechter (1974) proposed a simple model to compute the comoving number density of collapsed haloes 
directly from the statistical properties of the linear Gaussian density field (see also Doroshkevich 1967). According to 
the PS theory a patch of fiuid is part of a collapsed region of scale larger than M{Rf) if the value of the smoothed 
linear density contrast on the same scale exceeds a threshold t / . The idea is to use a global threshold in order to mimic 
non-linear dynamical effects ending up to halo collapse and virialization. An exact value for tf can be obtained by 
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describing the evolution of density perturbations according to the spherical top-hat model. In this case a fluctuation 
of amplitude S will collapse at a redshift Zf such that 5(x) — tf = 5c/D{zf). In the Einstein de Sitter universe and 
during the matter dominated era the critical value Sc does not depend on any cosmological parameter and is given by 
Sc — 3(127r)^^'^/20 ~ 1.686, whereas, for general cosmologies, it shows a weak dependence on the value of the density 
parameter, the cosmological constant, the Hubble constant, thus on redshift (e.g. Lacey & Cole 1993). 

The PS formula for the mass function has been thoroughly compared with the outcomes of N-body simulations, 
finding general good agreement (e.g. Efstathiou et al. 1988; Gelb & Bertschinger 1994; Lacey & Cole 1994). As 
mentioned in the Introduction, the model is intrinsically flawed by the cloud-in-cloud problem, namely the fact that 
a fluctuation on a given scale can contain substructures of smaller scales and the same fluid elements can be assigned, 
according to the PS collapse criterion, to haloes of different mass. Moreover, in a hierarchical scenario, one expects to 
find all the mass collapsed in objects of some scale, while the PS model can account only for half of it: this problem 
is intimately related to the fact that in a Gaussian field only half volume is overdense. Press and Schechter faced 
the problem by simply multiplying their result by a fudge factor of 2. In this section we review how the Langevin 
equations introduced above can be used to extend the PS theory in such a way to solve both problems. In the next 
sections we will use the same approach to compute the halo-halo correlation function. 

The solution of the cloud-in-cloud problem has been given by Peacock & Heavens (1990), Cole (1991) and BCEK. 
Their approach consists in considering at any given point the trajectory S(Rf) as a function of the filtering radius, 
and then determining the largest Rf at which S{Rf) upcrosses the threshold tf{zf) corresponding to the formation 
redshift Zf. This determines the largest mass collapsed at that point, all sub-structures having been erased. So, 
the computation of the mass function is equivalent to calculating the fraction of trajectories that first upcross the 
threshold tf a.a the scale M decreases. The solution of the problem is enormously simplified for Brownian trajectories, 
that is for sharp fc-space filtered density fields. In such a case one only has to solve the Fokker-Planck equation for 
the probability density VV((5, A) dS that the stochastic process at A assumes a value in the interval 5, 5 4- dS, 
dW{S,A) ^ 1 d^W{5,A) 

dA 2 dS^ ' ^ ' 

with the absorbing boundary condition VV(t/,A) — and initial condition W{S,0) — 5d{S). The solution is well 
known (Chandrasekhar 1943) 

iS-2tf) 



W{S,A-tf)d5 = 



/2^ 



exp I — cxp 



2A / ^ V 2A 



dS . (7) 



Defining S{A,tf) = J^^d^ VV(5, A, t/) as survival probability of the trajectories, one obtains the density probability 
distribution of first-crossing variances by differentiation 



1 dW{S,A,tf) 

2 dS 



V 2A 



^-P -T^ • (8) 



The function Vi (A) dA yields the probability that a realization of the random walk is absorbed by the barrier in 
the interval (A, A + dA) or, by the ergodic theorem, the probability that a fiuid element belongs to a structure with 
mass in the range [M{A + dA), A/(A)]. Finally, the comoving number density of haloes with mass in the interval 
[M, M + dM] collapsed at redshift Zf is 



niM,zf)dM = ^Vi{A) 



dA 



dM . (9) 



n{M,zf)dM^P'^''^''^ 



dM 

Inserting the expression of Vi (A) of eq. (^ in the latter equation one obtains the well-known PS expression for the 
mass function 

dlnA / tfizff\ 

The original fudge factor of 2 of the PS approach is now naturally justified. A generalization of this formalism to 
simple cases of non-Gaussian initial conditions has been given by Porciani et al. (1996). 

Previous investigations (e.g. Peacock & Heavens 1990; BCEK) have shown that only for sharp fc-space filtering 
it is possible to write an analytic formula for the mass function obtained from the excursion set approach. Numerical 



27r 



M2^A(A/) 



dluM 
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solutions of the cloud-in-cloud problem with physically more acceptable smoothing kernels like Gaussian and top-hat 
result in mass functions that are a factor of two lower in the high-mass tail and have different small-mass slopes 
compared with the PS formula. The standard interpretation of this result is that the excursion set method is not 
reliable for M <C M», where M», defined by A(M*) = tf, is the typical mass collapsing at 2:/. 



3 LAGRANGIAN CLUSTERING 

In this section we show how the excursion set approach can be extended to derive the clustering properties of 
dark matter haloes. Specifically, we study the evolution of a set of density fluctuation processes at different spatial 
locations, as the smoothing scale progressively shrinks. The trajectories turn out to be intimately correlated and the 
joint distribution of the first upcrossing filtering radii is used to extract the Lagrangian halo-halo correlation function. 



3.1 Two-point correlation function from joint upcrossing distribution 

Let us select two points in Lagrangian space xi e X2 = xi-|-r. We want to study the evolution of the stochastic processes 
(5i(A) = 5(xi, A) and 52(A) = (5(x2,A) as A varies. In particular, let us suppose that we know the joint probability 
distribution P2(Ai, A2; xi, r) of those pairs of variances (Ai,A2) that correspond to the first upcrossing scales of the 
threshold by the two processes Si (A) and ^2 ( A) . Because of the underlying homogeneity and isotropy, the probability 
density V2 cannot depend on the vector xi and on the orientation of r, i.e. ■P2(Ai, A2; xi, r) = ■P2(Ai, A2; r). Moreover, 
by the ergodic theorem one can identify ■P2(Ai, A2; r) with the probability distribution of the pairs (Ai, A2), obtained 
by randomly selecting points spatially separated by the distance r, within a given realization of the density field. 
Finally, following the arguments given in the previous section, we can interpret P2(Ai, A2; r) dAidA2 as the probability 
of finding two points separated by r within two haloes with mass in the intervals (Mi — dMi , Mi ) and (M2 — dM2 , M2 ) , 
as fixed by the corresponding variance ranges (Ai, Ai -|- dAi) and (A2, A2 -|- dA2). As we will discuss in detail in the 
next sections, the probability density 'P2(Ai, A2;r) can be obtained by integrating the system of correlated Langevin 
equations that describe the evolution of the processes (5i(A) and 52(A). 

In order to compute the halo-halo correlation function we adopt the following procedure. A class of objects is 
selected by the mass interval corresponding to the A-range I = [Amin, Amax]. The probability of determining two 
points separated by r contained within collapsed objects of class / is 

Vii{r)=jJdKidA2V2{Ai,A2;r). (11) 

Similarly, the probability of finding a point contained in an object of type I isVi = Jj dAVi{A). EYom the definition 
of correlation function we then obtain 
f^t^M-^" I /Jz^AidA2P2(Ai,A2;r) 

T^i [/^dA7'i(A)] 
In a similar fashion, considering disjoint classes 7i and /2, 
, , _ 44rfAidA2 7>2(Ai,A2;r ^ 
/,,dA7'i(A)4dA7'i(A) 

We stress that the quantities ^^\"{r) and (,^*^^i^{r) are the Lagrangian correlations of the mass elements contained in 
the collapsed haloes. This quantity has been recently used by Bagla (1997) to study the evolution of galaxy clustering. 
However, we are ultimately concerned with the calculation of the halo correlations £,'11 (r), so we have to properly 
weigh the statistical contribution for each extended halo. The problem shows particularly simple if we adopt the 
PS theory. Indeed, in this case, the sets of points where the first upcrossings occur at the same A are point-like 
disconnected regions (see Bond & Myers 1996 for a different approach). Only in a statistical sense they originate 
collapsed haloes, each contributing by 1/V{A), where V{A) = M/pb is the typical Lagrangian volume of an object of 
mass M associated to the variance A. Therefore, adopting the same procedure used to calculate the mass function. 



C/, (r) = r ... r - 1 ■ (13) 
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the mean number density of collapsed objects of scale A becomes n(A) = 'Pi{A)/V{A). Similarly, for the distribution 
of pairs at distance r, we define 
■p2(Ai,A2;r) 



n2(Ai, A2; r) 



l/(Ai)V(A2) ■ 

The idea is, once again, to allow for the finite size of the haloes. The halo-halo correlations become, respectively. 



(14) 



^11 (r) 



and 



nil _ /rrfAidA2n2(Ai,A2;r) 



[/,dAn(A)]- 



/^^ dAidA2 n2 (Ai , A2 , r) 

dAn(A)/ dAn(A) "^^ 



(15) 



(16) 



3.2 Correlated Langevin equations: sharp k-space filtering 

In Section 3.1 we showed that, in order to obtain the halo correlation function, it is crucial to know the joint 
distribution p2(Ai, A2; r). This quantity can be obtained by solving the system of equations governing the evolution 
of the pair of correlated processes Si{A) and 52(A): 

a<5i(A) 



dA 
dS2{A) 



Ci(A) 

= C2(A) 



dA 

{Ci(A)) = {C2(A))=0 



<5i(0) = 0, 
52(0) = 0, 

^1 and (^2 Gaussian processes , 



(17) 



{Ci(A)Ci(A')) = {C2(A)C2(A')) =fc(A-A') 



(Ci(A)C2(A')) = ^^|^fc(A-A') 



The latter equation, obtained by introducing sharp fc-space filtering in equation (|3|), completes the definition of the 
stochastic field (^(x. A) given in equations (^) and (^. Here ^(r; A) is the linear two-point correlation function for the 
mass density fiuctuations smoothed on the scale Rf = 1/fc/ associated to the variance A 



ar;A)^ — 



dk P{k) jo{kr) 



and one has 
aC(r;A) 



OA 



= jo[kf{A) r] 



(18) 



(19) 



By integrating the above differential equations and averaging over the ensemble one obtains the unconstrained prob- 
ability distribution 



Wr{5i,52;A) = 



27r^A2 -C(r;A)2 



exp 



A{Si+S'i)^2ar;A) SA 
2[A2-e(r;A)2] 



(20) 



This function solves the two-dimensional Fokker-Planck equation associated to the Langevin equation (|17[), namely 



dWr{SuS2;A) 



OA 



, g 9C(r;A) 
dS?, dA d5id52 



yVr{5i,52\A) , (21) 

with initial condition VVr((5i, ^2, A = 0) — (5i) (^2). The problem of finding the distribution of the first upcross- 
ings of the threshold by the binary process {81,82} reduces to that of imposing proper boundary conditions to the 
equation (^l|). As done for the one-dimensional Fokker-Planck equation, we adopt the absorbing barrier approach. 
However, in the two-dimensional case we are considering, the distribution p2(Ai, A2; r) cannot be eventually obtained 
from VVr((5i, ^2; A) simply by diff'erentiation. This is because the whole binary system automatically disappears as 
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soon as when one 'Brownian particle' is first absorbed. Nonetheless, the joint distribution p2(Ai, A2;r) can be in 
principle calculated by a two-step procedure as follows. 

Assuming the same initial condition, one solves the Fokker- Planck equation with absorbing barriers sd, 5i — tf 
and S2 = tf, thus finding the survival probability density for the pairs which have never crossed the thresholds. 
Having found this quantity one can compute the probability current through each point, 



J(<5i,<52;A) 



1 f dWr ^ d^{r;A) dWr 



2\d5i ' OA dh ' OA dSi ' dSi 
On a boundary wall, e.g. Si — tf, where Wr{tf,S2;A) = (implying dyVr/dS2 



0), this reduces to 



J{tf,52;A) = 



dWr 



dA 



dSi 



(22) 
(23) 

(1,0) is the 



(24) 



The flux through any point of the barrier Si = tf is then given by the scalar product J • n, where n 
unit vector perpendicular to the absorbing wall, 

^r{tf,52;A) = ^-^ . 

The quantity Tr(tf,52; A) dS2 represents the probability that the pair of processes {5i,S2) leave the permitted region 
passing through the 'gate' [{tf,S2), {tf,S2 +dS2)] at the time A. This flux contains all the information we need for the 
computation of 7^2 (Ai, A2; r). In fact, once Si has crossed the barrier at Ai, for A > Ai we are interested in studying 
only the evolution of the surviving process ^2 up to its first upcrossing through the boundary S2 ~ tf. Therefore, 
since we are considering Brownian trajectories, free of correlations along the A axis, for A > Ai the evolution of 
S2 is simply governed by its own Langevin equation, and its probability distribution can be calculated from the 
one- dimensional Fokker-Planck equation (^, with absorbing boundary S2 ~ tf, but with initial condition (at A = Ai) 
S2, = S2{Ai\Si — tf). Thus, by simply modifying eq. we find that the distribution of the variances A2 associated 
to first upcrossing events of the threshold by the process S2, given that Si upcrossed the critical level at Ai, is: 



ri{A2-Ai,tf-S2,) = 



{tf-52*) 



2-K (A2 - Al)3/2 



exp 



{tf-52,f 



2(A2-Ai) 



(25) 



The joint distribution 7-'2(Ai, A2; r) is eventually obtained by a convolution 

/•*/ /•*/ 
P2(Ai,A2;r) = / dS2Tr{tf,52;Ai)Vi{A2- Ai,tf - S2) + I dSi Tr{5i,tf; A2)Vi{Ai - A2,tf - 5i) , (26) 

J —00 J —ao 

where the first and second integrals on the r.h.s. represent, respectively, the contributions of those pairs for which 
A2 > Ai and A2 < Ai . 

However, this formal expression is useless unless we solve for the probability density Wr- There are two cases in 
which the calculation of Wr is trivial: one for r ^ 00 and the other for r ^ 0. The general case of finite non-zero lag 
r is far more complex, as we will discuss. 



3.2.1 Perfectly uncorrelated processes 

At infinite lag (r 00) the two processes become independent and the solution of the Fokker-Planck equation is 

Wo^(Si,S2;A) = WiSi,A;tf)W(S2,A;tf) , (27) 

where W{S, A;tf) denotes the probability distribution for a one-dimensional process with absorbing boundary at tf 
given in eq. (^) . This solution consists of a linear superposition of four unconstrained independent density distributions 
deriving from different initial conditions. In practice one has to consider the 'real' initial distribution S^{0,0), an 
image source S^{tf,tf) and two image sinks S^{tf,0) and S^{0,tf), that is to say 

Woo(<5i,<52;A) = g^{Si,52;A)-g^{Si-2tf,S2;A)-g^{Si,52-2tf;A) + g^{5i~2tf,S2~2tf;A) , (28) 

where g{5i, 82; A) = [27rA]~^ exp[— ((5i + (5|)/2A] is the solution of the two-dimensional Fokker-Planck equation with 
natural boundary conditions: lim^.^oo 5(<5i, <52; A) = 0, i = 1,2. Obviously, using eq. ( |26| ) we obtain 

© 0000 RAS, MNRAS 000, 000-000 



10 C. Porciani, S. Matarrese, F. Lucchin & P. Catelan 



P2(Ai,A2;r-->oo)=Pi(Ai,t/)Pi(A2,t/) , (29) 
that, inserted in equations ( [l^ ) and (jl^) or in equations ( p^ ) and (p^, gives, as expected for infinite lag, — ^'^'^ — 0. 

3.2.2 Perfectly correlated processes 

When r the two processes become more and more correlated so that, in the limit, we end up with a one- 
dimensional problem. In this case the solution of the Fokker-Planck equation is 

>Vo(cSi,52;A) = W{Si + S2,4A;tf)S^{Si- 52) , (30) 

where W{5, A;tf) denotes the probability distribution given in eq. (^). Expanding this expression as a superposition 
of Green's functions, by using the method of image sources, we obtain 

>Vo(<Si,52;A) = go{Si,S2;A) -go(5i-2tf,S2~2tf-A) , (31) 

where Goix, y; A) = W{x, A;tf)S^ {x — y). In this case, for the joint distribution of first upcrossing variances, we find 

P2 (Ai , A2 ; r ^ 0) = Pi ( Ai , t J ) 5^ ( Ai - A2 ) . (32) 

Consequently, we obtain ^ff ^ [1/ dAP(A)] - 1 and {/^ dA P(A)/V^(A)V[Xf dA P{A) /V {A)f} - 1. Simi- 

larly, eil]^ ^ and S,'}'^,^ = 0. 



3.2.3 Approximate general solution 

In the limiting cases just discussed we were able to account for the boundary conditions by writing the solution in 
terms of a superposition of image-distributions, each one solving the Fokker-Planck equation. Regrettably, the position 
and the sign of the image sources of probability come out dependent on the correlation between the processes (i.e. 
on r). This fact suggests that we cannot write a general analytical solution by simply applying the image method. 
However, we will build here a simple function that satisfies the absorbing boundary conditions being also an accurate 
approximation for the solution of the correlated diffusion equation. In the next section we will test the accuracy of 
this solution against the numerical one. 

It is evident that, for small separation r <C -R/ (we remind the reader that we change the smoothing radius Rf 
at fixed separation r), the perfectly correlated solution will represent a very good approximation to the true one. 
This can be easily deduced by the following argument. The two Gaussian stochastic processes S(A) = (5i(A) -I- (52(A) 
and A(A) = (5i(A) — (52(A) are statistically independent, i.e. {E(A)A(A)) = 0. The variances of their unconstrained 
probability distributions are, respectively, (t| = 2[A + ^{r;A)] and (t^ — 2 [A — ^(r; A)]. Therefore, for r Rf 
[i.e. for A ^ (^^{r), where (T^(r) denotes the variance of the mass density fluctuations smoothed on the scale r] 
where ^(r;A) ~ A, we have ~ and the probability distribution of the variable A is practically a Dirac delta 
function centred on the zero value. This corresponds to the perfectly correlated situation. However, this regime is not 
interesting for computing the halo correlation function since the condition r <^ Rf implies that the points in which 
we follow the trajectories that upcross the threshold are involved in the collapse of the same halo. 

On the other hand, for r ^ Rf (i.e. for A ^ a^{r)), we can replace in eq. ( |2l| ) ^(r; A) with the unsmoothed linear 
mass-correlation ^m{r). Therefore, eq. ( pl[ ) simply becomes the uncorrelated two-dimensional diffusion equation that 
can be easily solved using the image method. Our ansatz for the full Wr((5i, (^2; A) is then obtained by keeping the 
analytic form of the solution just obtained for r 2> Rf but inserting in it the correlation function ^(r; A) to replace 
its large lag limit ^m{r). Therefore, we have 

Wr{Si,S2;A) = g+{Si,S2;A)^g-{Si-2tf,S2;A)~g;:(Si,S2-2tf;A) + g+{Si-2tf,S2-2tf;A) , (33) 
where 

Qr = , = exp 

27r^A2 -5(r;A)2 
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Figure 2. The approximate solution of the Fokker-Planck equation given in eq. is plotted in the left panel (here A = 1 
and 5 = 0.4). The probability density in the physical region is blown up in the right panel. In the contour plots the continuous 
lines correspond to the levels 0, 0.025, 0.05, 0.1, 0.15 (starting from outside), while the dotted lines refer to the same values 
multiplied by —1. The figure clearly shows that the proposed solution satisfies the absorbing boundary conditions for Si = tf 
and S2 = tf (here = 1.686). Moreover, the orientation of the iso-probability curves in the allowed region (these are practically 
ellipses with their major axes lying on the line Si = 52) indicates that the approximate VV(<5i, 52; A) has the right correlation 
properties. 



By using the symbol we want to emphasize the correlation properties of the adopted Green's function: is a 
correct solution of the diffusion equation, while Q~ does not solve it. However, to satisfy the boundary conditions, 
we need to insert it twice in the solution. The probability distribution given in eq. (^) will be a valid approximation 
to the proper one provided the term 

^^dT^ [Sr{di-2tf,52;A)+ Gr {5i,52-2tf-A)] (35) 

can be neglected compared to the A-derivative of the expression in eq. (^3[). A three-dimensional representation of 
this approximate solution and of its contour levels is given in Fig. ^ Inserting eq. (^sj) into equations (^ and ( p^ ) 
we obtain 



p2(Ai,A2;r) 



4Ai A2 + [Ai A2 - 4(Ai + A2)] A,n) + tj C(r; An,f - e(r; Am)^ 



exp 



tj Ai + A2 



27r[AiA2 -^(r;A„ 
2ar;Am)' 



^2l5/2 



(36) 



2 A1A2 -C(r;A„02 

where Am — min(Ai,A2). By using this expression to compute the halo-halo correlation function between objects 
selected in infinitesimal mass ranges, we obtain 
tobj, \ — epts,-, _ P2(Ai,A2;r) 



r^(r) = r^(r) 



1 



(37) 



Pi(Ai)Pi(A2) 

whose explicit expression is reported in Appendix A, eq. (AI). We can also expand the halo two-point function in 
powers of the filtered mass auto-correlation function. 



y- 

^ n! 

n=l 



6„(Ai)6„(A2)C('-;A„ 



(38) 



where the factors 6,1 (A) coincide with the Lagrangian bias coefficients introduced in eq. (15) of CLMP. It is important 
to stress that eq. (js^ , with the ansatz for V2 provided by eq. (^6|) and Vi given by eq. (^ , represents an approximation 
to the exact form of the halo-halo correlation function which can be obtained by numerically integrating the correlated 
Langevin equations, as discussed in the next section. Other approximations have been suggested by CLMP following 
two different approaches: 1) defining a local halo counting operator acting on the underlying Gaussian density field 
[their equation (14), which is also reported in Appendix A, eq. (A2)], it) using the peak-background split to define 
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the halo overdensity free of the cloud-in-cloud problem down to the background scale. Both these models give the 
same series of Lagrangian bias factors of eq. (js^) provided the lag is a few times larger than the Lagrangian halo size. 
Let us explicitly write the first four bias factors 

(39) 

3 

tfA ' 

15 
^ A2 • 

The linear bias term, that for M 7^ M, dominates the halo correlation at large separation, coincides with the one 
obtained by Mo & White (1996). This implies that, in this limit, haloes with M > M, have 61 > 0, i.e. are biased 
with respect to the mass distribution in Lagrangian space. Notice that 61 can be very large when M ^ A/*. On the 
contrary, objects with M < A/, have — 1/t/ < &i < 0, i.e. are moderately antibiased. In the limiting case M — M*, 
61 = and the leading term of ^'"'^ is proportional to ^(r; Am)^, implying much lower halo correlations compared with 
difi^erent mass ranges. R 



61(A) 
62(A) 
63 (A) 
64(A) 



if 


1 


A 


'~f' 


if 


3 


A2 


A ' 


if 


6tf 


A3 


A2 


if 


lot?. 


A* 


A3 



3.3 Monte Carlo simulations 

In order to check the validity of the approximate solution introduced in the previous section we solved numerically 
for the joint distribution of first upcrossing variances by integrating our spatially correlated Langevin equations. We 
stress that this method gives the exact halo-halo correlation function in the excursion set approach, consistently 
completing the PS analysis of the mass function. 

The stochastic differential equations ( p^ are equivalent to the integral equations 

<5i(A + 7)-5i(A)= /" ''dA'Ci(A'), 5i(0) = 0, 

J A 

(40) 

52(A + 7)-52(A)= / rfA'C2(A'), ^2(0)^0, 

J A 

where the statistical properties of the Gaussian processes ("1 and ("2 are given in eq. ( |l7| ) . The general procedure used 
to solve numerically a stochastic differential equation replaces the equivalent integral equation by its expansion in 
power series of ^7, truncates the series after a selected number of terms and gives a rule for computing each term 
that is considered. To control the effect of the temporal discretization an extrapolation of the results for 7 is 
often required (Greiner, Strittmatter & Honerkamp 1988). Fortunately, in the case of a set of Wiener processes this 
procedure can be greatly simplified. In fact, by integrating over a finite timestep 7, equations ( |40| ) simply give 

f <5i(A + 7)-5i(A) = aii(7,A)Gi , <5i(0)=0, 
(52(A + 7)-52(A) = a2i(7,A)Gi+a22(7,A)G2 , 52(0) = 0, 

(41) 

aii(7,A)^ = a2i(7,A)^ + 022(7, A)^ = 7 , 

L aii(7,A)a2i(7,A) = ^{r; A + ■y) - ^{r; A) , 

where the d are independent Gaussian variables with zero mean and unit variance. This set of equations gives the 
fundamental recipe to produce trajectories that are obtained iterating eq. ( ^ ) at each time step by modeling the 
Gi terms with Gaussian pseudo-random numbers. To generate these normally distributed deviates in first passage 



* Note, however, that in the Eulerian case the linear bias term is non-zero also for M ~ Af« (e.g. CLMP). 
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problems, where rare fluctuations are crucial, it is of fundamental importance to adopt a method that is accurate even 
for less probable events. For this reason we adopted the Box-MuUer algorithm (e.g. Press et al. 1992), that is rather 
slow but produces an unbiased Gaussian distribution (in practice the limited precision of numerical computation only 
affects the extreme tail behaviour). A moderate speeding up (roughly 20%) is obtained by modifying the algorithm 
to use a rejection technique (Knuth 1981; Press et al. 1992). 

To estimate the first-passage time distribution one first solves the discretized stochastic equation starting at the 
initial point and terminates the simulation of a trajectory as soon as the boundary is reached. In order to avoid 
the resulting distribution being influenced by the temporal discretization one has to account for possible intra-step 
crossings. In fact, the conditions iS(A) < tf and 5(A + 7) < tf do not guarantee that the process 5 has ever crossed 
the threshold during the time interval 7.^ For this reason, the simple algorithm of choosing as first-crossing time that 
corresponding to the first step at which S > tf is very inaccurate, unless one uses very small time-steps. This problem 
can be solved by performing a small Monte Carlo test at each time step as shown by Strittmatter (1987). In this way 
we obtain high precision even using larger time- steps, therefore reducing the CPU time. 

For a given power-spectrum, once the value for the critical threshold tf and the lag r are selected, the algorithm 
just introduced gives a pair of first upcrossing variances for each realization of the processes 5i and 52- Therefore, the 
joint probability p2(Ai, A2; r) and the halo-halo correlation function can be obtained by considering a large number 
of realizations. 



3.4 Results 

We present here the results obtained by considering two different scale- free power-spectra P{k) oc k" with n = —1 
and n = —2 in an Einstein-de Sitter universe. In these cases the evolution of clustering is self-similar and the results 
obtained at a particular epoch are representative of the whole history. These two values of the spectral index can be 
thought as typical of any physically reliable power-spectrum on scales relevant for galaxy formation in a hierarchical 
scenario. Adopting a standard procedure we normalize the power-spectrum so that the linear mass variance as 
measured in 8/i~^Mpc spheres is equal to 1 and we impose Sc = 1.686. 

Concerning the selection of the mass ranges that identify different classes of haloes, we optimized their broadness 
in order to balance the CPU time requirement (too narrow ranges turn out to be poorly statistically populated) 
with an accurate description of clustering. For these reasons we selected three different classes of objects for each 
power-spectrum (the first contains objects with M ^ M, , the second has M ~ M, and the third Af <C Af,) by 
requiring that they are roughly equi-populated (in terms of first upcrossing events, not number of haloes). In this 
way we can explore all the expected regimes of Lagrangian clustering (a biased halo distribution for M S> M«, an 
almost unclustered distribution for M ~ Af* and an antibiased distribution for M <C Af«) with approximately equal 
numerical accuracy. 

Table 1 gives the parameters that define the three classes of haloes we selected and the probability of occurrence 
of first upcrossing events in each of them [this is obtained by integrating eq. (^) over the corresponding interval of A]. 
A technical problem one has to deal with is the occurrence of spurious oscillations in the correlation function induced 
by the sharp fc-space filter. In fact, the sharpness of the smoothing kernel unavoidably gives rise to oscillations in the 
mass correlations computed in the Fourier-conjugate space. To exemplify, by convolving the linear density field with 
WsKs{k, Rf) one obtains ^m(r; A) oc [1 — cos(fc/(A)r)]/r^ for n — —1 and (,m{r;A) oc Si(kf(A)r)/r for n = —2, where 
Si(z) = dy jo{y). Unfortunately, this oscillating behaviour affects also This can be easily understood in the 
Langevin formalism, where the oscillations are generated by the unsmoothed Bessel function appearing in eq. <§j\j, 
which is unavoidable in the Wiener process approach. One can try to overcome this problem by replacing in eq. ( ^ ) 
the sharp fc-space filtered correlation ^(r; A) with the one obtained with top-hat smoothing. This is analogous to the 
technique generally applied to compare the mass function predicted by the excursion set approach to the outputs of 



t This has a striking analogy with the solution of the cloud-in-cloud problem given by BCEK. 
© 0000 RAS, MNRAS 000, 000-000 



14 C. Porciani, S. Matarrese, F. Lucchin & P. Catelan 

Table 1. Parameters that identify the classes of haloes. 



Class 






(n = -1) 


in = -1) 


A^min/M. 

(n = -2) 


(n = -2) 




h 


0.45 


1.79 


2 


16 


4 


256 


0.20 


h 


1.79 


4.51 


1/2 


2 


1/4 


4 


0.22 


h 


4.51 


11.37 


1/8 


1/2 


1/64 


1/4 


0.19 



N-body simulations (e.g. Lacey & Cole 1993, 1994). This method strongly reduces the oscillations but, for n > —2, is 
not able to erase them completely at separations comparable to the halo size. In fact, for n = — 1 and top-hat filtering, 
the term 021 defined in eq. ( ^ ) is positive for A ^ becomes negative for A ^ cr^(r) and rapidly approaches zero 

for A 3> o"^(t'), after assuming a minimum negative value. This term plays a fundamental role in the computation of 
the halo correlation function, causing the appearance of oscillations when r is comparable to the Lagrangian radius 
of the given halo. This problem could be totally avoided by coherently adopting from the beginning a more realistic 
(i.e. non-sharp fc-space) window function, as sketched in Appendix B. The price one had to pay, however, is that of 
dealing with a space-correlated set of coloured stochastic processes. In the following we will always use eq. (^) and 
top-hat smoothing to obtain f'^'" . 

In order to compute the correlation function, for each physical separation r we followed the evolution of many 
realizations of the stochastic processes 5i and ^2, until lO" pairs of trajectories crossed the threshold at resolutions 
Ai and A2, both contained in one of the three selected intervals. The lag r is taken in the range 1 < r/7?« < 12 for 
n — —1 and 1 < r/i?« < 40 for n = —2, where is the Lagrangian radius associated to the characteristic halo mass 
M.. 

Each simulation has been repeated several times (20 for n — —1 and 8 for n = —2) using different sequences of 
pseudo-random numbers to build up the trajectories. The values for S,^^{r) shown in Fig. 3 are obtained by averaging 
over the different simulations while the error bars represent the standard deviation of the mean. 

In Fig. 3 we also plot the analytic expressions predicted by two different models: our approximated solution of 
the Fokker-Planck equation in eq. (^6|) [see also eq. (Al)] and the 'counting field' model described in eq. (14) of CLMP 
[see also eq. (A2)]. Let us recall that these two models give rise to the same Lagrangian bias factors, i.e. to the same 
clustering regime, provided the halo separation is a few times larger than their Lagrangian size. Asymptotically, both 
the two analytic forms and the numerical integrations tend to the lowest non-vanishing term of the series expansion 
in eq. (^), which, except for the mass range centered on M,, coincides with the Mo & White (1996) prediction. In 
general, the agreement of both models with the results of the Monte Carlo simulations is remarkably good, except 
for lags of order the halo Lagrangian size. Here the discrepancy between analytical and numerical results becomes 
larger and larger as the ratio M/M* decreases. In particular, none of the two models is able to follow the detailed 
features of the numerical solution at small separations, where, at least for n = —1, the spurious oscillations induced 
by the adoption of top-hat smoothing in eq. ( |4l| ) - which has been instead derived after sharp fe-space filtering - play 
a relevant role. 

We can try to give a more detailed description of the numerical outcomes by introducing in eq. (14) of CLMP 
[which is reported explicitly in Appendix A, eq. (A2)] an extra modulation induced by a decaying sinusoidal term. 
Let us call ^clmp the analytic form for ^'"'^ appearing in eq. (14) of CLMP. We can then introduce the following 
'best-fitting models' 



e''{r)~^CLMp{r) 
1 + ^CLMp{r) 

e'^ir) - icLMp(r) 



Ci cos (C2^ + exp -C4 (^-^^ 
Ci cos (^2^ + Cs^ exp 



r 



(42) 



(43) 



1 + icLMp{r) 

respectively for n = —1 and n = —2. The coefficients Ca (a = 1,...,4), which are found using the Levenberg 
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Table 2. Parameters for the 'best-fitting models'. In the last row the value of C3 is set to n since C2 — causes a degeneracy 
between Ci and cos(C3). This is removed by assigning an arbitrary value to C3. 



n 


Class 


Ci 


C2 


C3 


C4 


-1 


h 


0.24 ±0.02 


2.47 ±0.16 


-2.7 ± 0.3 


0.570 ± 0.016 


-1 


h 


1.69 ± 0.10 


1.34 ±0.07 


0.00 ±0.08 


0.880 ± 0.008 


-1 


h 


21 ± 6 


0.65 ± 0.13 


1.1 ±0.2 


2.49 ± 0.13 


-2 


h 


7.0 ±1.5 


0.35 ± 0.07 


6.8 ±0.2 


1.050 ± 0.012 


-2 


h 


57± 11 


0.08 ±0.16 


1.49 ±0.14 


1.90 ±0.02 


-2 


h 


0.86 ±0.12 


0.0 ±0.2 


TT 


1.475 ±0.011 



Marquardt non-linear least-squares method (e.g. Press et al. 1992) in each mass range, are given in Table 2. The 
main limitation of this 'best-fit' approach is clearly that the coefficients Ca depend both on the shape of the power- 
spectrum and on the halo masses. Although we did not attempt any such approach, it ought to be possible to 
parametrize these two dependences. 

4 CONCLUSIONS 

In this paper we have proposed a model for the clustering of dark matter haloes in Lagrangian space. Our model is 
based on a natural extension of the excursion set approach to the PS theory (e.g. BCEK), namely we accounted for 
the spatial correlations of the linear mass density fluctuation field. 

In particular, the two-point halo correlation function has been obtained by integrating the system of Langevin 
equations governing the evolution of pairs of correlated density processes or, oquivalcntly for a Gaussian mass dis- 
tribution, the bivariate Fokker-Planck equation for the density probability distribution of the same processes, once 
appropriate boundary conditions have been imposed. We believe that the numerical integration of the correlated 
Langevin equations allowed for the most reliable determination of the Lagrangian halo correlation, complementing 
the Press-Schechter inspired analyses of the halo mass function. 

Although we gave explicit results only for the halo two-point function, a generalization to higher order statistics 
would be straightforward. The halo correlation function obtained with the present approach is fully consistent with 
the recent results obtained by CLMP, and its form at large separation reproduces the linear bias relation by Mo 
& White (1996), which has been shown to be in good agreement with the clustering of synthetic haloes in N-body 
simulations. 

As stressed by CLMP, the issue of transforming the halo distribution from the Lagrangian to the Eulerian world 
is a non-trivial one, especially on smaller scales and for smaller mass systems, which are most affected by the intrinsic 
non-linearity of the evolved mass density field and by the occurrence of multi-streaming. Nevertheless, one might 
speculate that a stochastic approach analogous to the one considered so far would be viable also in Eulerian space. 
The main modification induced by the Lagrangian-to-Eulcrian map would in fact be a local modulation of the halo 
formation threshold, namely tf{zf) — > t'f{x,Zf) = tf{zf) — (5mo(x, «/) , with (5mo(x, 2:/) the Eulerian mass density 
contrast smoothed on some background mass scale Mo much larger than the halo one. 

The results of this paper represent a relevant step towards the construction of a local identification algorithm 
for halo formation sites in Lagrangian space, which would allow to depict halo maps starting from low-resolution 
simulations of the evolved dark matter distribution. As a straightforward application of this general idea one can 
apply the Zel'dovich approximation (Zel'dovich 1970) to follow the dark matter dynamics on mildly non-linear scales 
and use it to reconstruct the local halo number density at various epochs (Catelan, Matarrese & Porciani 1998). 

One of the most interesting direct applications of our technique is the construction of spatially correlated halo 
merging trees. State-of-the-art algorithms to follow mass accretion histories are entirely based either on the mean 
one-point distribution of the first upcrossing events Vi (Lacey & Cole 1993; Kauffmann & White 1993; Somerville 
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Figure 3. The Lagrangian halo correlation function ^^'^ in an Einstein-de Sitter universe with two different scale-free power- 
spectra, n = —1 {left column) and n = —2 (right column), is shown for three halo mass ranges. The object separation r is 
scaled to the Lagrangian radius of the least massive halo in each range. The vertical dotted lines, where shown, are placed at 
the Lagrangian radius of the most massive halo in each range. The points represent the mean value of different realizations 
obtained by numerically solving our correlated Langevin equations, while the error bars represent the scatter of the mean. 
The continuous lines refer to the 'best-fit models' of eqs. (ttd) and (|43|). The dashed lines are obtained from our approximated 
solution of the Fokker-Planck equation in eq. ( ^ ) [see also eq. (Al)], while the dot-dashed lines show the predictions of the 
'counting field' model of eq. (14) in CLMP [see also eq. (A2)]. Top-hat filtering is used in all cases. 
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& Kolatt 1997) or on Monte Carlo realizations ol the one-point Wiener process 5(A) (e.g. Tozzi et al. 1996). In both 
cases one has to extrapolate the succession of merging events involving a large number of haloes coming from different 
Lagrangian regions from the knowledge of the average properties of a single mass accretion history. Obviously some 
ad hoc assumptions need to be made in the tree reconstruction process, to supply for the lack of statistical and spatial 
information; examples are the assumption of binary merging in Lacey & Cole (1993) or the recent efforts aimed at 
distinguishing between mass accretion and merging events (Salvador-Sole et al. 1997; Somerville & Kolatt 1997). 

In a more realistic approach, however, each branch of a merger tree should be associated to a different Lagrangian 
position. Thus, spatial correlations of the initial density field would manifest themselves as statistical correlations both 
between different branches of the same tree and between different trees. The importance of this issue was stressed 
already in Lacey & Cole (1993). Somerville & Kolatt (1997) argued that accounting for the complex correlated 
structure of the density field would permit the construction of merger trees starting at high redshift and propagating 
forward in time. 

A first attempt to include spatial correlations in modeling the hierarchical growth of dark matter haloes was 
made by Yano, Nagashima & Gouda (1996) and by Rodrigues & Thomas (1996). The main effort of these authors 
was however devoted at solving the cloud-in-cloud problem when the Lagrangian regions that ultimately collapse into 
haloes are properly treated as extended ones. Their predictions for the mass function imply more high-mass objects 
and less low-mass haloes than the PS expression. 

However, the issue we want to focus on is a very different and more ambitious one. By adopting the excursion 
set theory, we aim at building realizations of merger histories by following the fate of the correlated trajectories 
associated to each branch of the same tree. Since we are not modifying the one-point distribution of first upcrossing 
events compared with the standard solution of the cloud-in-cloud problem (BCEK) our merging histories are a priori 
consistent with the PS mass function. 

Concerning other quantities that characterize the ensemble of merging histories (such as halo survival and forma- 
tion times) we expect that accounting for spatial correlations will lead to relevant modifications. In general, compared 
with the uncorrelated case, we should obtain corrections whose importance depends on the mass and the epoch under 
consideration. Intuitively, we would expect the larger effects on those objects whose mass accretion histories are dom- 
inated by merging of halos with similar masses. In fact, as shown in Fig. 0, the correlation between nearby trajectories 
is important only when the smoothing length is larger than or comparable to the physical separation between the 
Lagrangian points to which the trajectories are associated. In a merger tree the physical separation between two 
branches should be roughly given by the sum of the Lagrangian radii of the haloes measured just before they merge. 
Therefore, the effect of spatial correlations of the density field will be completely negligible if the two Lagrangian 
halo sizes are much different. These and related issues will be analyzed in a forthcoming paper. 
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APPENDIX A: ANALYTIC EXPRESSIONS FOR THE HALO TWO-POINT FUNCTION 

For the sake of completeness we give here the full expression of two approximations for frequently referred to in 
the main text and used in Fig. ^. These formulae represent the results of different generalizations of the PS formalism, 
either in the probabilistic Fokker-Planck formulation of the excursion set approach, discussed in Section 3.2.3, or in 
the local counting field theory given by CLMP. 

The explicit result for the halo correlation deriving from the ansatz introduced in Section 3.2 is obtained by 
inserting eqs. (^) and in eq. (^). Thus, for the cross correlation between haloes selected in the infinitesimal mass 
ranges corresponding to the intervals Ai + dAi and A2 -I- dA2 , one gets 

^Ai A2 + [A1A2 - 4(Ai + A2)] A^) + 4 Am)' - ?(r; A^)' 



exp 



tj (Ai + A2) AnO' - 2 Ai A2 ^(r; A^) 



(Al) 



2 AiA2[AiA2-C(r;Am)21 
On the other hand, the solution in the CLMP model, once their eq.(14) has been recast in the present notation, is 



CT2 /I \ dcj (j'i /I Lu \ duj o"icr| d'^u] 



\f\~~uJ^ \ (1 — tJ^) VtJi (72/ 9(72 (1 — o;-^) Vi72 C7i / 9(71 da\da2 
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aio-2 



exp 



j{l - LU^) + {1 + LU^) ■ 



LU t' 



(J\02 



( — — 



2 



(A2) 



Here Oi = h\^^ and w = ^(Ai, A2; r)/cri(T2 with 5(Ai,A2;r) the correlation function between the Unear mass density 
fluctuation field smoothed with two different resolutions Ai and A2. For sharp fc-space filtering ^(r; Ai, A2) = £,{r; Am)- 
ft is worth mentioning that for separations larger than the smoothing lengths, when the derivatives of ^(r;Ai,A2) 



with respect to Ai and A2 are negligible, eq. (A2) reduces to eq. (AI). Moreover, both approximations asymptotically 
reach the linear bias regime studied by Mo & White (1996). 



APPENDIX B: CORRELATED LANGEVIN EQUATIONS: GENERAL SMOOTHING 

Collecting the results outlined in Section 2.2, we can describe the evolution of an ensemble of pairs of correlated 
trajectories by solving the system of Langevin equations 

^^^^^2(Rf), hm 52{Rf)^0, 

drif Rf—*oo 

{rii{Rf)) = {rj2{Rf)) = , rji and rj2 Gaussian processes , j^g-j^-j 



< ^ 



27r2 /„ ^ ' dRf dR'j 



We wrote here the equations adopting the same notation as in Section 2.2, i.e. using the smoothing radius Rj as 
time variable for the trajectories. In this way we emphasize the role played by the filter in determining the statistical 
properties along and between the trajectories. For sharp fc-space filtering, the quantity dW{kRf) / dRf reduces to 
a Dirac delta function, which leads to the set (^^ of the main text. On the other hand, the advantage of a more 
realistic filter is that the oscillations induced by the zeroth order Bessel function are damped. 
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